Tag reagent and assay method

ABSTRACT

This invention provides reagents, libraries and sets of the reagents, and assay methods using the reagents, the reagents comprising an analyte moiety and a tag moiety, wherein the tag moiety contains information, detectable by mass spectrometry, defining the identity and location of the analyte residues of the analyte moiety.

[0001] This is a continuation of Ser. No. 08/586,875, filed Feb. 5, 1996, which is a 371 of PCT/GB94/01675, filed Aug. 1, 1994.

[0002] In biological and chemical analyses, the use of analyte molecules labelled with reporter groups is routine. This invention addresses the idea of providing reagents having at least two analyte groups linked to one or more reporter groups. Such reagents can be used, in ways described below, to generate much more analytical information than can simple labelled analytes. It is possible to code reporter groups so that reagents carrying multiple analyte groups and multiple reporter groups can be synthesised combinatorially and used simultaneously and the reporter groups resolved in the analytical stage.

[0003] WO 93/06121 (Affymax) describes a synthetic oligomer library comprising a plurality of different members, each member comprising an oligomer composed of a sequence of monomers linked to one or more identifier tags identifying the sequence of monomers in the oligomer. The linkage between the oligomer and the identifier tag preferably comprises a solid particle. The identifier tag is preferably an oligonucleotide.

[0004] Proc. Natl. Acad. Sci., Vol 89, No. 12, Jun. 15, 1992, pages 5381-5383 (S Brenner and R A Lerner) describe encoded combinatorial chemistry for making a library of reagents each containing a genetic oligonucleotide tag.

[0005] In Rapid Communications in Mass Spectrometry, Vol 6, pages 369-372 (1992), G R Parr et al describe matrix-assisted laser desorption/ionisation mass spectrometry of synthetic oligodeoxyribonucleotides.

[0006] In Nucleic Acids Research, Vol 21, No. 15, Jul. 25, 1993, pages 3347-3357, E Nordhoff et al describe the ion stability of nucleic acids in infra-red matrix-assisted laser desorption/ionisation mass spectrometry.

[...] to which the analyte moiety and the tag moiety are both attached. Preferably the analyte moiety is a chain of n analyte residues, and the tag moiety is a chain of up to n reporter groups, the reporter group at each position of the tag chain being chosen to designate the analyte residue at a corresponding position of the analyte chain. n is an integer of at least 2, preferably 3 to 20.

[0007] The invention may be used for the detection of all analytes of interest. These include, but are not limited to, a protein/peptide chain so that the analyte residues are amino acid residues; a nucleic acid/oligonucleotide chain so that the analyte residues are nucleotide residues; a carbohydrate chain so that the analyte residues are sugar residues. Additionally the analyte may be a class of small molecules with biological, pharmacological or therapeutic activity. For example it could be a core molecule with the ability to vary various substituent groups, e.g. alkyl, esters, amines, ethers etc., in a combinatorial manner with mass spectrometry tags.

[0008] The tag moiety and/or the or each reporter group in it is capable of being observed/detected/analysed so as to provide information about the nature of the analyte moiety, and/or the analyte residues in it.

[0009] In one embodiment, the reagent has the formula A - L - R where A is a chain of n analyte residues constituting the analyte moiety, L is the linker, R is a chain of up to n reporter groups constituting the tag moiety, and n is 2 - 20, wherein the tag moiety contains information defining the location of analyte residues in the analyte moiety.

[0010] The tag moiety consists of one or more reporter groups distinguishable by mass and thus capable of being analysed by mass spectrometry. The reporter groups may be chemically different and thus distinguished from one another by molecular weight. Or the reporter groups may be chemically identical, but distinguished from one another by containing different isotopes (e.g. ¹²C/¹³C and ¹H/²H as discussed below). The tag moiety is, and/or the reporter groups are, suitable or adapted for analysis by mass spectrometry e.g. after cleavage by photochemical or other means from the reagent.

[0011] The advantages of mass spectrometry as a detection system are: its great sensitivity (only a few hundred molecules are needed to give a good signal); its wide dynamic range and high resolving power (molecules in the mass range 100 to 200,000 Daltons can be resolved with a resolution better than 0.01); its versatility (molecules of many different chemical structures are readily analysed); the potential to image analytes by combining mass spectrometry with, for example, scanning laser desorption; and the ability to make quantitative as opposed to merely qualitative measurements.

[0012] Thus mass-labelling combines advantages of radioactivity and fluorescence and has additional attributes which suggest novel applications.

[0013] In another aspect, the invention provides a library of the above reagents, wherein the library consists of a plurality of reagents each comprising a different analyte moiety of n analyte residues. For example, the library may consist of 4^n reagents each comprising a different oligonucleotide chain of n nucleotide residues. The reagents of the library may be present mixed together in solution.

[0014] In another aspect, the invention provides an assay method which comprises the steps of: providing a target substance; incubating the target substance with the said library of reagents under conditions to cause at least one reagent to bind to the target substance;

[0015] removing non-bound reagents; recovering the tag moieties of the or each bound reagent; and analysing the recovered tag moieties as an indication of the nature of the analyte moieties bound to the target substance.

[0016] The target substance may be immobilised, as this provides a convenient means for separating bound from non-bound reagent. In one aspect, the target substance may be an organism or tissue or group of cells, and the assay may be performed to screen a family of candidate drugs. In another aspect, the target substance may be a nucleic acid, and this aspect is discussed in greater detail below.

[0017] Reference is directed to the accompanying drawings in which:

[0018] FIG. 1 is a general scheme for synthesis of reagents according to the invention.

[0019] FIG. 2 shows reagents with three different systems of tag chains containing reporter groups.

[0020] FIG. 3a is a diagram showing synthesis of coded oligonucleotides, and

[0021] FIG. 3b is a diagram showing reading the code of a tag chain.

[0022] FIG. 4 is a diagram showing sequence analysis by progressive ligation.

[0023] FIG. 5 is a diagram on extending the sequence read by hybridisation to an oligonucleotide assay.

[0024] Legends to FIGS. 1 and 2 are included at the end of this specification.

[0025] Reference is directed to the example applications below, describing how the method may be applied to the analysis of nucleic acid sequences, and to screening candidate drugs.

[0026] Synthesis of Coded Tags

[0027] The principle of the method used for tagging multiple analytes simultaneously is similar to that proposed by Brenner and Lerner (1992) for coding peptides with attached nucleic acid sequences. The intention of their idea is to add a tag which can be amplified by the polymerase chain reaction and read by sequencing the DNA molecule produced.

[0028] The structure of reagents is best illustrated by considering how they could be made. Synthesis starts with a bivalent or multivalent linker which can be extended stepwise in one direction to add a residue to the analyte and in another to add residue-specific reporter groups (FIG. 1). Suppose we wish to make a mixture of organic compounds, introducing different residues at each stage in the synthesis. For example, the mixture could comprise a set of peptides with different amino acid sequences or of oligonucleotides with different base sequences, or a set of variants with potential pharmacological activity with different groups attached to a core structure; in each case we wish to label each structural variant with a unique tag. This is done by dividing the synthesis at each step where different residues are added to the compound of interest, and adding corresponding residues to the tag.

[0029] As an example, suppose we wish to make a mixture of 4096 hexanucleotides, each with a unique tag. Four samples of a bivalent linker would be coupled with each of the bases and with the unique reporter for the base (FIG. 3a). The four samples are then mixed, divided in four and the process repeated. The result is a set of dinucleotides each with a unique tag. The process is repeated until six coupling steps have been completed.
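The split, couple and mix cycle described above can be sketched in a few lines of Python. This is an illustrative model only, not the synthetic procedure itself: the function name `split_and_mix` and the data layout are assumptions made for the example, and the reporter symbols are taken from Table 1 later in this specification.

```python
# Reporter symbol assigned to each base, as listed in Table 1.
REPORTER_FOR_BASE = {"A": "r30", "C": "r31", "G": "r32", "T": "r33"}

def split_and_mix(n_steps, bases="ACGT"):
    """Model n_steps cycles of divide / couple / mix (hypothetical helper).

    Returns a dict mapping each analyte sequence (bases in coupling order)
    to its tag chain (reporters in the same coupling order).
    """
    pool = [("", ())]  # a single bare linker: no analyte residues, no reporters
    for _ in range(n_steps):
        next_pool = []
        for base in bases:  # divide the pool into one portion per base
            for analyte, tag in pool:
                # couple the base and its reporter to every molecule in the portion
                next_pool.append((analyte + base, tag + (REPORTER_FOR_BASE[base],)))
        pool = next_pool  # mixing the portions gives the pool for the next cycle
    return dict(pool)

library = split_and_mix(6)
print(len(library))       # 4096 hexanucleotides
print(library["ATCTAG"])  # tag chain for one member, in coupling order
```

Each member of the modelled library carries a tag chain that no other member shares, which is the property the divided synthesis is intended to guarantee.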

[0030] The Linker and Reporter Groups

[0031] The linker should have one group that is compatible with analyte synthesis: hydroxyl, amino or sulphydryl groups are all suitable for initiating oligonucleotide synthesis, and similar groups can be found to initiate other pathways, for example, synthesis of polypeptides. For some classes of compounds it may be desirable to start with a “core” compound which forms part of the analyte. The choice of the group(s) for starting addition of reporters depends on the nature of the reporter groups and the chemistry used to couple them. This chemistry has to be compatible with that used for synthesising the analyte. For the example of oligonucleotide synthesis, there are a number of alternatives. The established method uses benzoyl and isopropyl groups to protect the bases, acid-labile trityl groups for temporary protection of the 5′-OH groups during coupling, and β-cyanoethyl groups to protect the phosphates. The method used for coupling the reporters should not attack these protecting groups or other bonds in the oligonucleotide, and the synthesis of the tags should not be affected by the coupling, oxidation, and deprotection used in the extension of the oligonucleotide.

[0032] The coupling of the reporter monomers, or the capping of the chain, may be incomplete at each step (FIG. 2, B and C), so that the analyte is coupled to a nested set of reporter structures. This will make it easier to deduce the structure of the analyte from the composition of the tag (FIG. 1; FIG. 3). To make the synthesis easier it is desirable for the linker to be attached to a solid support by a linkage which can be cleaved without degrading the analyte or the reporter groups. Alternatively, the linker may carry a group such as a charged group or a lipophilic group which enables separation of intermediates and the final product from reagents.

[0033] The reporter groups could take many forms; the main consideration is the need to read the composition or sequence of the tag by mass spectrometry. Possibilities include groups with different atomic or formula weights, such as aliphatic chains of different lengths or different isotopic composition. Using isotopically labelled methylene groups, it is possible to assign a group of unique formula weight to each of four different reporters (Table 1).

TABLE 1 Example reporters based on isotopes of hydrogen and carbon:

Isotopic Composition    Formula Weight (of -OCH₂)    Symbol    Base
¹²CH₂                   30                           r₃₀       A
¹²CHD, ¹³CH₂            31                           r₃₁       C
¹²CD₂, ¹³CHD            32                           r₃₂       G
¹³CD₂                   33                           r₃₃       T
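The formula weights in Table 1 follow from nominal isotope masses, as the short Python check below illustrates. It is a sketch under the assumption that integer (nominal) masses are adequate for this purpose; the helper name `och2_weight` is introduced here only for illustration.

```python
# Nominal (integer) masses of the isotopes involved.
NOMINAL_MASS = {"12C": 12, "13C": 13, "H": 1, "D": 2, "O": 16}

def och2_weight(carbon, hydrogens):
    """Nominal formula weight of one -O-CX2- unit (X = H or D); hypothetical helper."""
    return NOMINAL_MASS["O"] + NOMINAL_MASS[carbon] + sum(NOMINAL_MASS[h] for h in hydrogens)

# Isotopic compositions listed in Table 1: (carbon isotope, the two hydrogens).
variants = {
    "12CH2": ("12C", "HH"),
    "12CHD": ("12C", "HD"),
    "13CH2": ("13C", "HH"),
    "12CD2": ("12C", "DD"),
    "13CHD": ("13C", "HD"),
    "13CD2": ("13C", "DD"),
}

weights = {name: och2_weight(*spec) for name, spec in variants.items()}
for name, weight in weights.items():
    print(name, weight)
```

The six compositions fall into four weight classes (30, 31, 32 and 33), one per base.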

[0034] Taking the example of oligonucleotides, these tags can make a set which allows the base at each position in the oligonucleotide to be read from the incremental masses of the partial products in the series (Table 2). All oligonucleotide sequences will give a unique series of tag fragment weights provided the smallest increment in adding a reporter is larger than the mass difference between the smallest and the largest reporter.

TABLE 2 Example oligonucleotide with isotopic reporters:

G-A-T-C-T-A . . . P-r₃₀-r₃₃-r₃₁-r₃₃-r₃₀-r₃₂

Formula weights of partial products: $F_{p} + 30,\ F_{p} + 63,\ F_{p} + 94,\ F_{p} + 127,\ F_{p} + 157,\ F_{p} + 189$
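Reading the bases back from the nested set of partial products amounts to taking successive differences of the formula weights. The Python sketch below illustrates this with the reporter weights of Table 1; the function names are hypothetical, and the weights are expressed as offsets above the common term F_p so that its value need not be known.

```python
# Formula weight of each reporter unit and the base it designates (Table 1).
WEIGHT_FOR_BASE = {"A": 30, "C": 31, "G": 32, "T": 33}
BASE_FOR_WEIGHT = {w: b for b, w in WEIGHT_FOR_BASE.items()}

def partial_product_offsets(coupling_order):
    """Offsets above F_p of the nested partial products (hypothetical helper)."""
    offsets, total = [], 0
    for base in coupling_order:
        total += WEIGHT_FOR_BASE[base]
        offsets.append(total)
    return offsets

def read_bases(offsets):
    """Recover the bases, in coupling order, from the incremental masses."""
    bases, previous = [], 0
    for offset in sorted(offsets):
        bases.append(BASE_FOR_WEIGHT[offset - previous])
        previous = offset
    return "".join(bases)

# Tag chain of Table 2 read outward from P: r30, r33, r31, r33, r30, r32.
coupling_order = "ATCTAG"
offsets = partial_product_offsets(coupling_order)
decoded = read_bases(offsets)
print(offsets)
print(decoded, decoded[::-1])  # coupling order, then the oligonucleotide as written
```

Because every increment (at least 30) exceeds the spread between the lightest and heaviest reporter (3), each difference maps to exactly one base, which is the uniqueness condition stated above.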

[0035] For mass spectrometry, it will be desirable to have a simple way of cleaving the tag chain from the analyte. There are several possibilities. Among methods compatible with oligonucleotide and peptide analytes are: light induced cleavage of a photolabile link; enzymatic cleavage, for example of an ester link; free-radical induced cleavage.

[0036] A further requirement is that the tags should be compatible with the chemical and biochemical processes used in the analysis: for the example of oligonucleotides used in molecular hybridisation or for one of the proposed sequencing methods, they must be soluble and they must not inhibit certain enzymatic reactions which may be used in the analysis. Experience has shown that oligoethylene glycol linkages, similar to the methylene analogues shown in Table 1, are compatible with molecular reassociation of oligonucleotides. Furthermore, such linkages are compatible with at least some enzymatic reactions as we have shown that oligonucleotides tethered to glass through a hexaethylene glycol linker can be converted to a 5′-phosphomonoester by treatment with polynucleotide kinase and ATP.

[0037] Desirable Properties of the Linker

[0038] For the applications envisaged, it is desirable that the linker molecule has the following properties:

[0039] It should be possible to link it to a solid support to allow for synthetic cycles to produce the analyte and corresponding tags to proceed without the need for cumbersome purification of intermediates. Following synthesis cycles, the linker should be removable from the solid support under conditions which leave the analyte and tags intact. The functional group for tag synthesis should be such that it allows for the ready synthesis of tags which are distinguishable from each other by mass spectrometry.

[0040] The linker should have protected functional groups that allow for the extension of the analyte and the tags separately, under conditions in which the chemistry for one does not interfere with that of the other.

[0041] The linker should preferably carry a charged group so that mass spectrometry can be carried out in the absence of a matrix. Further to this aim, it is desirable that the tags should comprise compounds which are volatile enough to evaporate in the mass spectrometer, without recourse to complex techniques such as the electrospray. The tags should either produce stable ions or ions which fragment to characteristic patterns that can be used to identify the corresponding analyte.

[0042] The link between the tag and analyte should preferably be photocleavable, so that tags can be directly cleaved in the mass spectrometer by laser irradiation, and further cleavage to remove them completely to allow biochemical steps such as ligation, can be carried out conveniently by exposure to a lamp.

[0043] The linked products should preferably be soluble in aqueous solvents, so that they can be used in biochemical reactions.

[0044] The examples described herein show linkers with these desired properties.

[0045] Photocleavable Group

[0046] The photocleavable group has been based on the known photolabile o-nitrobenzyl group. This group has been used as a protecting group for both the phosphate group and 2′ hydroxy group in oligonucleotide synthesis [see the review by Pillai, Synthesis 1 (1980)]. In itself the o-nitrobenzyl group lacks further functionalisation for subsequent attachment of a linker between tags and analyte. Available from commercial sources is the compound 5-hydroxy-2-nitrobenzyl alcohol. It is known that OMe groups can be added in the 5,4 position without significant reduction in photolabile properties (see Pillai review). Thus, the 5-hydroxy-2-nitrobenzyl alcohol was used as a starting point with the aim of extending DNA synthesis from the benzyl alcohol and the linker chain to the tags from an ether coupling at the 5-hydroxy group.

[0047] The requirement is for a functional group to be present to permit the combinatorial synthesis of analytes and tags. A linker arm is therefore required from the photocleavable group to the required functional group for tag synthesis. It is also preferred that the combinatorial synthesis be carried out on a solid support. Thus, the linker arm must be bivalent in functional groups and have orthogonal protecting groups to permit selective synthetic transformation. Preferred tag reagents contain glycol linkages/ether linkages. For synthesis, oligonucleotides are normally linked to a long chain amino CPG support via the 3′ hydroxy and a succinic ester link. Thus the functional groups required were deemed to be alcohols.

[0048] The following intermediate compound has been synthesised.

[0049] This comprises an aromatic linker carrying:

[0050] a methoxytrityl group (—CH₂ODMT) for analyte synthesis;

[0051] an o-nitro group for photocleavage;

[0052] an O-t-butyl diphenyl silyl group (OTBDPS) for tag synthesis;

[0053] a tertiary amine group for conversion to a positively charged group for analysis by mass spectrometry;

[0054] and an N-hydroxysuccinimidyl group for attachment to a support.

[0055] When the analyte is a peptide only minor modifications to conditions need be considered. The 2-nitrobenzyl group is stable under most of the conditions of peptide synthesis and it and related analogues have already been used as photolabile groups in peptide synthesis (see Pillai review and the references contained therein). There are already several resins suited to peptide synthesis with different modes of cleavage. The orthogonal protecting groups for analyte and tag synthesis would be based on t-butoxycarbonyl and 2-methoxyethoxymethyl. The t-butoxycarbonyl group would be used to protect the amino group in the amino acids with cleavage being effected by a trifluoroacetic acid treatment. The 2-methoxyethoxymethyl group would be used to protect the tagging groups, and the tags would be based on mass-differentiated 1,n-alkyldiol derivatives as before. The cleavage of t-butoxycarbonyl groups has been shown to be compatible with the 2-methoxyethoxymethyl protecting groups. The 2-methoxyethoxymethyl protecting groups can be selectively cleaved with zinc bromide in dichloromethane. While the above illustrates the procedure, those skilled in the art will recognise that this set of orthogonal protecting groups is by no means limiting but serves as a representative example.

[0056] Detection and Analysis of Reporters.

[0057] Photocleavage is the favoured method of releasing tags from analytes; it is fast, can be carried out in the dry state, and scanning lasers can be used to image at a very small scale, small enough to image features within cells (de Vries et al., 1992), so that the proposed method could be used to detect the positions of specific analytes that had been used to “stain” the surface or the insides of cells, or different cells in a tissue slice, such as may be required to image interactions between ligands, e.g. candidate drugs, and their receptors.

[0058] Photosensitive protecting groups are available for a very wide range of chemical residues [reviewed in Pillai, 1980]. The photolabile o-nitrobenzyl group, which can be used as a protecting group for a wide range of compounds, forms an ideal starting point for a linker for many analytes that could be envisaged, peptides and oligonucleotides among them. Taking the example of oligonucleotides, it provides a photosensitive link that can be broken quantitatively to give a hydroxyl group. This will permit the deprotected oligonucleotide to take part in the ligation extension as described in the sequencing method below. Furthermore, the group is known to be stable during oligonucleotide synthesis. It would be necessary to modify the benzyl ring to provide a group that can be used to initiate the synthesis of the tags; reporters such as the oligoethyleneglycol series described above do not interfere with the photochemical cleavage reaction of the o-nitrobenzyl group (Pillai op. cit.). Other groups can be added to the aromatic ring which enhance the cleavage; such groups could be exploited to add a charged group(s) to simplify analysis in the mass spectrometer. Modern mass spectrometers are capable of measuring a few hundred molecules with a resolution better than one Dalton in a hundred, up to a total mass of 200 kD. A preferred photolabile linker may be represented thus; in which the positively charged group R may be directly attached to the aromatic ring or may be present in one of the linker arms:

[0059] Instrumentation.

[0060] The proposed molecular tags would be analysed by one of several forms of mass spectrometry. For many purposes, although it will be desirable to cleave the tags from the analytes, it will not be necessary to fragment the tags, and indeed it may be undesirable as it could lead to ambiguities. Recent developments in mass spectrometry allow the measurement of very large molecules without too much fragmentation; and as it is possible to design the linker so that it is readily cleaved, under conditions where the rest of the tag is stable, fragmentation of the tag during measurement should be avoidable. The analyte group will, in most cases, be less volatile than the tag, and in many applications will be bound to a solid substrate, and thus prevented from interfering with mass spectrometry.

[0061] The linker illustrated above is very labile to photon irradiation under conditions which will cause no cleavage of the great majority of covalent chemical bonds. A suitable instrument has been described [de Vries et al., 1992]. This uses a laser that can be focussed down to a spot smaller than 1 μm. Images of up to 250 mm are scanned by moving a stage that can be positioned to 0.1 μm.

[0062] This instrument also allows for ionisation of the species to be measured by shining an ionising laser across the surface of the stage so that it interacts with the species lifted by the desorption laser. This could be useful for the present method if it were not possible to include a charged residue in the tags, or if fragmentation is desirable for reading the tags.

[0063] In another aspect the invention provides a method of sequencing a target nucleic acid, which method comprises the steps of:

[0064] a) providing an oligonucleotide immobilised on a support,

[0065] b) hybridising the target nucleic acid with the immobilised oligonucleotide,

[0066] c) incubating the hybrid from b) with the library as defined above, in which the reagents are mixed together in solution, so that an oligonucleotide chain of a first reagent of the library becomes hybridised to the target nucleic acid adjacent to the immobilised oligonucleotide,

[0067] d) ligating the adjacent oligonucleotides, thus forming a ligated first reagent,

[0068] e) removing other non-ligated reagents, and

[0069] f) recovering and analysing the tag moiety of the ligated first reagent as an indication of the sequence of a first part of the target nucleic acid.

[0070] Example Applications

[0071] We illustrate potential applications by referring to ways in which coded oligonucleotides could be used in nucleic acid analysis.

[0072] 1. Nucleic acid sequence determination by progressive ligation (FIG. 4).

[0073] The sequence to be determined is first hybridised in step b) to an oligonucleotide attached to a solid support. If the DNA to be sequenced has been cloned in a single strand vector such as bacteriophage M13, the “primer” oligonucleotide on the solid support can be chosen to be part of the vector sequence. In step c), the solid support carrying the hybrids from step b) is incubated with a solution of the coded oligonucleotide reagents, e.g. with the aforesaid library, comprising all sequences of a given length, say 4096 hexanucleotides (4^n n-mers, in general). In step d), ligase is introduced so that the hexanucleotide complementary to the first six bases in the target DNA is joined to the immobilised primer oligonucleotide. By this step a first coded oligonucleotide reagent from the library is joined, by ligation of its oligonucleotide chain to the immobilised primer oligonucleotide, and is herein referred to as a ligated first reagent.

[0074] In step e), non-ligated reagents are removed, e.g. by washing. In step f), the linker of the ligated first reagent is broken to detach the tag chain, which is recovered and analysed as an indication of the sequence of a first part of the target DNA.

[0075] Preferably, removal of the linker also exposes a hydroxyl or phosphate group at the end of the first oligonucleotide chain, making it available for ligation with the oligonucleotide chain of a second reagent. Several methods for breaking the linker, including photochemical and enzymatic and chemical hydrolysis, can be used to generate the 3′-hydroxyl or 5′-phosphate group needed for further ligation. Steps c), d), e) and f) are then repeated. These steps involve hybridisation of a second reagent from the library, ligation, recovery and analysis of the tag chain of the ligated second reagent, and generation of another 3′-hydroxyl or 5′-phosphate group needed for further ligation. The process can be repeated until the whole DNA sequence has been read or until yields in the reaction become too low to be useful.

[0076] Four stages of this sequence are shown diagrammatically in FIG. 4. The first diagram corresponds to the situation at the end of step e) first time round. The second diagram corresponds to the situation at the end of step f). The third diagram corresponds to the position at the end of step c) second time round. The fourth diagram corresponds to the situation at the end of step d) second time round. The cyclic nature of the technique is indicated.
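The information flow of the progressive ligation cycle, steps c) to f), can be illustrated with a simplified Python model. It idealises the chemistry completely (perfect hybridisation, ligation and cleavage), and makes two labelled assumptions: the template is written 3′ to 5′ starting at the base next to the primer, and each tag is listed in the same order as the reagent sequence it encodes. All names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
REPORTER_FOR_BASE = {"A": "r30", "C": "r31", "G": "r32", "T": "r33"}  # Table 1
BASE_FOR_REPORTER = {r: b for b, r in REPORTER_FOR_BASE.items()}

def progressive_ligation(template_3to5, n=6):
    """Idealised cycles of steps c) to f); returns one decoded read per cycle.

    template_3to5: template bases from the primer junction onward, written
    3'->5', so that the ligated reagent (written 5'->3') is its base-by-base
    complement. Bases left over after the last full n-mer are not read.
    """
    reads = []
    for start in range(0, len(template_3to5) - n + 1, n):
        window = template_3to5[start:start + n]
        # steps c) and d): the one library member complementary to the window ligates
        ligated = "".join(COMPLEMENT[b] for b in window)
        # step f): its tag chain is cleaved off and read by mass
        tag = tuple(REPORTER_FOR_BASE[b] for b in ligated)
        reads.append("".join(BASE_FOR_REPORTER[r] for r in tag))
    return reads

template = "TACGGATCCGTA"
reads = progressive_ligation(template)
recovered = "".join(COMPLEMENT[b] for b in "".join(reads))
print(reads)
print(recovered == template)
```

Each cycle contributes n bases, so the read length grows linearly with the number of cycles until, as noted above, reaction yields become limiting.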

[0077] 2. Nucleic acid sequencing of multiple templates by sequential ligation.

[0078] In an extension of the first example, it is envisaged that many sequences could be analysed simultaneously. For example, individual clones of the DNA to be sequenced could be immobilised:

[0079] a) Use can be made of an array of pins with the same vector oligonucleotide immobilised on the end of each. An individual clone of the target DNA is hybridised to the oligonucleotide immobilised on each individual pin. The array of pins carrying these hybrids is then incubated with the library of coded oligonucleotide reagents in a solution which also contains the ingredients for ligation. As a result of this step, each pin carries a different ligated reagent. Finally, the tag chain of each ligated reagent is recovered and analysed as before. If the pins of the array are suitably spaced, they may be dipped into the wells of microtitre plates, the first plate containing the templates to be sequenced, the second the library of reagents and ligation solution, and the third plate containing a reagent for cleaving the tag chains from the pins.

[0080] b) Alternatively, a surface may be coated with the primer oligonucleotide, preferably covalently attached through its 5′ end or alternatively at some other point. Individual clones of the DNA to be sequenced are spotted at spaced locations on the coated support, so that each individual clone of the target DNA is hybridised to the oligonucleotide immobilised at an individual spaced location on the support. The support is then incubated with a solution containing the library of reagents and the ingredients for ligation. Non-ligated reagents are removed. Then the linker of the ligated reagent at each spaced location is cleaved and the tag recovered and analysed. Cleavage is preferably effected by a method such as laser desorption which can address small areas on the surface. An advantage of this approach is that very large numbers of DNA sequences can be analysed together.

[0081] 3. Extension of methods for sequence determination by hybridisation to oligonucleotides

[0082] a) Format I.

[0083] Methods for spotting DNAs at high density on membranes are well established [Hoheisel et al., 1992; Ross et al., 1992]. For fingerprinting and for sequence determination, oligonucleotides must be applied either singly or in small sets so that the hybridisation patterns are not too complex to interpret; as a consequence, only a small proportion of templates give signal at each round of analysis. If the signal from each hybridisation contained coded information which allowed its sequence to be determined, more complex mixtures could be used and much more information collected at each round of hybridisation. The complexity of the mixture would depend on the length of the DNA templates and on the ability of the analytical method to resolve sequences in mixed oligonucleotides.

[0084] Nucleic acid probes encoded with these mass spectrometry tags or reporter groups will be very valuable where the use of multiple probes is advantageous, e.g. DNA fingerprinting or mutation analysis. The mass spectrometry tags offer the advantage of multiplexing.

[0085] A number of different probes each labelled with its own unique and appropriate mass spectrometry tag can be used together in typical nucleic acid hybridisation assays. The sequence of each individual probe which hybridises can be uniquely determined in the presence of others because of the separation and resolution of the tags in the mass spectrum.

[0086] In this aspect, the invention provides a method of sequencing a target nucleic acid, which method comprises the steps of:

[0087] i) providing the target nucleic acid immobilised on a support. Preferably individual clones of the target nucleic acid are immobilised at spaced locations on the support.

[0088] ii) incubating the immobilised target nucleic acid from i) with a plurality of the coded oligonucleotide reagents described above, so that the oligonucleotide chains of different reagents become hybridised to the target nucleic acid on the support,

[0089] iii) removing non-hybridised reagents, and

[0090] iv) recovering and analysing the tag moiety of each reagent as an indication of the sequence of a part of the target nucleic acid.

[0091] Preferably thereafter use is made of the library of reagents, with the hybridisation, ligation, cleavage and analysis steps being repeated cyclically to provide additional information about the sequence of the target nucleic acid.

[0092] b) Format II.

[0093] It is possible to determine nucleic acid sequences from the pattern of duplexes formed when they are hybridised to an array of oligonucleotides. The length of sequence that can be determined is approximately the square root of the size of the array: if an array of all 65,536 octanucleotides is used, the sequences to be determined should be around 200 bp [Southern et al., 1992]. The limit in size is imposed by the constraint that no run of eight bases should occur more than once in the sequence to be determined. The array and its use in sequence determination are described in International patent application WO 89/10977; and a method of providing an array of oligonucleotides immobilised e.g. by their 5′-ends or their 3′-ends on a surface is described in International application WO 90/03382.
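The constraint that no run of eight bases should occur more than once can be checked directly for any candidate sequence. The following Python sketch is offered only as an illustration of that constraint; the function name `has_repeated_run` is an assumption of the example, not part of the cited method.

```python
def has_repeated_run(sequence, k=8):
    """Return True if any run of k consecutive bases occurs more than once."""
    seen = set()
    for i in range(len(sequence) - k + 1):
        run = sequence[i:i + k]
        if run in seen:
            return True  # this k-base run has already been seen at an earlier position
        seen.add(run)
    return False

# Short demonstration with k=4 so that the repeat is easy to see by eye.
print(has_repeated_run("ACGTACGTAC", k=4))  # "ACGT" occurs twice
print(has_repeated_run("ACGTTGCA", k=4))    # every 4-base run is distinct
```

A sequence for which the function returns True could not be reconstructed unambiguously from an array of k-mers alone, which is why the readable length is bounded as described.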

[0094] By the method of the present invention, the sequence length that can be determined can be greatly extended. In this aspect of the invention, the method comprises the steps of:

[0095] a) providing an array of oligonucleotides immobilised at spaced locations on a support, the oligonucleotide at one location being different from oligonucleotides at other locations. Preferably the sequence of the oligonucleotide immobilised by a covalent bond at each spaced location on the support is known,

[0096] b) incubating the target nucleic acid with the array of immobilised oligonucleotides, so as to form hybrids at one or more spaced locations on the support,

[0097] c) incubating the hybrids from b) with the library of coded oligonucleotide reagents, so that an oligonucleotide chain of a reagent of the library becomes hybridised to the target nucleic acid adjacent to each immobilised oligonucleotide,

[0098] d) ligating adjacent oligonucleotides thus forming ligated reagents at the one or more spaced locations on the support,

[0099] e) removing other non-ligated reagents, and

[0100] f) recovering and analysing the tag moiety of each ligated reagent as an indication of the sequence of a part of the target nucleic acid.
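Steps a) to f) can be pictured with a simplified in silico model. The sketch below is an illustration under stated assumptions, not the method itself: sequences are written in the sense of the target strand (base complementarity is ignored), hybridisation and ligation are assumed to be perfect, the ligated reagent is assumed to read the m bases immediately following each k-base array site, and the function name `read_array` and the example target are hypothetical.

```python
def read_array(target: str, k: int, m: int) -> dict[str, set[str]]:
    """Model the read-out of an array of k-mers extended by m-base tagged reagents.

    Each key is an array location, identified by the k bases of the target it
    captures. Each value is the set of m-base sequences reported by the tags
    of the reagents ligated at that location.
    """
    readout: dict[str, set[str]] = {}
    for i in range(len(target) - k - m + 1):
        cell = target[i:i + k]              # site captured by the immobilised oligonucleotide
        tag = target[i + k:i + k + m]       # adjacent bases decoded from the ligated tag
        readout.setdefault(cell, set()).add(tag)
    return readout


if __name__ == "__main__":
    # "ACG" occurs twice in this target, so hybridisation alone is ambiguous.
    result = read_array("ACGTTACGGA", k=3, m=2)
    for cell in sorted(result):
        print(cell, sorted(result[cell]))
```

In this toy case the location "ACG" carries two different tags, so the two occurrences of the repeated run are distinguished by their adjacent bases, provided the tag read-out can resolve mixtures.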

[0101] Preferably cleavage of the tag chain at each spaced location is effected photochemically by means of a laser. Preferably analysis of the tag chains is by mass spectrometry. Preferably the hybridisation, ligation, cleavage and analysis steps are repeated cyclically, as described above, so as to obtain additional information about the sequence of the target nucleic acid.

[0102] A preferred sequence of operations is shown in the four diagrams constituting FIG. 5. The first diagram shows the position at the start of step b). The second diagram shows the position at the end of step b): a portion of the target nucleic acid has become hybridised to a tethered oligonucleotide forming part of the array. The third diagram shows the position at the end of step c), and the fourth diagram shows the position at the end of step d); a reagent from the library has become hybridised to the target nucleic acid and ligated to the immobilised oligonucleotide.

[0103] The results of this extension of the known method are dramatic. A single extension by a length equal to the length of the oligonucleotides in the array squares the overall length that can be read, provided that the method used to read the tags can resolve mixtures. In this case the length that can be read from an array of octanucleotides extended by eight bases is around 60,000 bases.

[0104] Comparison of hybridisation analysis with tagged oligonucleotides with:

[0105] a) Gel-based methods.

[0106] The most advanced instrument for automated sequence analysis is capable of reading around 40000 bases per day. This does not include the time for the biological and biochemical processes needed to provide the reactions that are loaded on the gel. If we assume that templates can be applied to a surface at a density of one per square millimeter [Hoheisel et al., 1992; Ross et al., 1992], 10000 could be applied to an area of 100×100 mm. After hybridisation, there would be several fmol of tagged oligonucleotide in each cell so a single 2 nsec pulse of the laser may release enough tag to read, but even if we assume that 100 pulses are needed, then the total time for a cell to be read is a few msec, so that all 10000 cells could be read in a few minutes. If the oligonucleotides were hexamers, the raw data acquired would be 60000 bases. For sequence determination, this would not be as informative as the equivalent raw data from a gel, because much longer continuous lengths are read from gels. This advantage for gels would, of course, be lost if the sequence read from the array could be extended by further rounds of analysis. But the fundamental advantage of array-based approaches is the parallelism which enables thousands of templates to be analysed together; the number that can be analysed on a gel is limited by the width of the gel to less than fifty.

[0107] b) Present array-based methods.

[0108] The major drawbacks of existing array-based methods are:

[0109] a) The sequence that can be read from an array of size N is only ≈ √N, so that most cells of the array are empty. By adding tagged oligonucleotides, the occupancy of the array could be near complete, so that information would be obtained from most cells. The reason for this is that additional information from the tags helps remove ambiguities due to multiple occurrences of short strings in the target sequence (Table 3).

[0110] b) The length of sequence that is read from each interaction with an oligonucleotide by hybridisation is necessarily limited to the length of the oligonucleotide. This causes problems in reading through repeating sequences, such as runs of a single base. Extending the read by ligation will permit reads as long as can be traversed by repeated ligations.

[0111] c) Of present detection methods, radioactivity has high sensitivity but poor resolution, fluorescence has low sensitivity and high resolution; both are relatively slow. The proposal to use mass spectrometry could improve resolution, speed and sensitivity, as well as adding the potential to read the sequences of tags.

TABLE 3
In general, the sequence that can be determined from templates distributed on a spatially segmented array is ≈ √(4^L) = 2^L, where L is the sum of the continuous lengths read by oligonucleotides. This would include the length of the oligonucleotide on the solid support in example 3b but not in example 2.

L     2^L
12    4096
14    16384
16    65536
18    262144
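The relationship underlying Table 3 can be reproduced with a few lines of Python. This is an illustrative sketch only; the function name `readable_length` is hypothetical, and the calculation simply takes the square root of the number of possible sequences of length L.

```python
import math


def readable_length(total_read: int) -> int:
    """Approximate sequence length determinable when each interaction reads
    `total_read` contiguous bases: sqrt(4**L), which equals 2**L."""
    return math.isqrt(4 ** total_read)


if __name__ == "__main__":
    for L in (12, 14, 16, 18):
        print(L, readable_length(L))
```

For L = 16 (an octanucleotide array extended by eight bases) the function returns 65536, in line with the figure of around 60,000 bases given for that case.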

[0112] Analytes With Potential Pharmacological Activity

[0113] Many drugs are tissue-specific. Their action often depends on interaction with a cell-surface receptor. There are families of drugs based on core structures; for example, there are several comprising short peptides. It is useful to be able to trace candidate drugs to see which cells or tissues they may target. It would be useful to be able to trace many different candidates simultaneously. Using libraries of analytes tagged with coded mass-tags, it would be possible to trace interactions by examining cells or tissues in the mass spectrometer. If tags were attached by photolabile protecting groups, it would be possible to image whole animal or tissue sections using scanning laser cleavage, coupled with mass spectrometry.

[0114] The following Examples further illustrate the invention.

[0115] Examples 1 to 6 show steps, according to the following Reaction Scheme 1, in the synthesis of a compound (8) comprising an aromatic linker carrying: a methyloxytrityl group (-CH₂ODMT) for analyte synthesis; an o-nitro group for photocleavage; an O-t-butyl diphenyl silyl group (OTBDPS) for tag synthesis; a tertiary amino group for conversion to a positively charged group for analysis by mass spectrometry; and an N-hydroxysuccinimidyl group for attachment to a support.

[0116] Examples 7 and 8 show subsequent steps according to the following Reaction Scheme 2.

[0117] Examples 9 and 10 show steps, according to the following Reaction Scheme 3, of preparing reporter groups (13) based on propan-1,3-diol.

[0118] Examples 11 to 13 show steps, according to the following Reaction Scheme 4, involved in attaching a protected propan-1,3-diol residue as a reporter group to compound (6).

[0119] Examples 14 to 19 describe the preparation, characterisation and use of various reagents according to the invention.

[0120] General Detail

[0121] 5-Hydroxy-2-nitrobenzyl alcohol was purchased from Aldrich, long chain alkylamino controlled pore glass from Sigma. Anhydrous solvents refer to Aldrich Sure Seal grade material packed under nitrogen. Triethylamine was predistilled from calcium hydride and stored under nitrogen prior to use. Other solvents and reagents are available from a range of commercial sources.

[0122] ¹H NMR spectra were obtained on a Jeol 270 MHz machine using the solvent indicated and referenced to tetramethylsilane.

[0123] Infrared spectra were obtained on a Nicolet 5DXC F.T. IR machine either as a potassium bromide disc or chloroform solution as indicated.

[0124] Melting points were obtained on a Gallenkamp melting point apparatus and are uncorrected.

[0125] Tlcs were run on Kieselgel 60 F₂₅₄ aluminium backed Tlc plates using the solvent system indicated. The plates were visualised by ultraviolet light and/or by dipping in a 3% w/v ethanolic solution of molybdophosphoric acid and then heating with a hot air gun. Trityl containing species show up as a bright orange spot, alcohols as a blue spot.

[0126] Silica gel chromatography was performed using flash grade silica gel, particle size 40-63 μm.

[0127] Abbreviations used in the reaction schemes and text.
DMT       4,4′-dimethoxytrityl
THF       tetrahydrofuran
TBDPS     tert-butyldiphenylsilane
DMAP      4-dimethylaminopyridine
DCCI      dicyclohexylcarbodiimide
CH₂Cl₂    dichloromethane
CPG       controlled pore glass
MeI       iodomethane
Tresyl    2,2,2-trifluoroethylsulphonyl

EXAMPLE 1 Synthesis of 5-hydroxy-O-(4,4′-dimethoxytrityl)-2-nitrobenzyl alcohol (Compound 2, Scheme 1)

[0128] To 5-hydroxy-2-nitrobenzyl alcohol (5.11 g, 30.2 mmol) dissolved in anhydrous pyridine (40 ml) was added 4,4′-dimethoxytrityl chloride (10.25 g, 30.2 mmol) and the flask stoppered. The reaction mixture was then left to stir at room temperature for a total of 72 hours. T.l.c. analysis (ether/pet. ether 40-60° C., 65%/35%) revealed the presence of a new trityl-positive material with an R_(F) of 0.27 and disappearance of the starting alcohol. The pyridine was then removed by rotary evaporation, with the last traces being removed by coevaporation with toluene (x2). The resultant gum was dissolved up in ethyl acetate and the solution washed with water (x1) and brine (x1). The ethyl acetate solution was then dried over anhydrous magnesium sulphate and evaporated to a reddish brown gum. The gum was dissolved in CH₂Cl₂ (20 ml) and then applied to a silica gel column (14 cm×6.5 cm) which was eluted with ether/pet. ether 40-60° C., 65%/35%. The product fractions were combined and the solvent removed by rotary evaporation to give an off white solid (13.49 g, 95%, mpt. 80-82° C. with decomposition). An analytical sample was prepared by recrystallisation from chloroform/pet. ether 40-60° C., mpt. 134-7° C. with decomposition.

[0129] ¹H NMR (270 MHz, CDCl₃, δ): 3.79 (s, 6H, DMT-OCH₃), 4.63 (s, 2H, CH₂-ODMT), 6.77-6.85 (m, 5H, aryl), 7.22-7.49 (m, 9H, aryl), 7.63 (s, 1H, aryl), 8.06 (d, 1H, J=9.06 Hz, aryl).

[0130] IR (KBr disc), 1610, 1509, 1447, 1334, 1248, 1090, 1060, 1033,828 cm⁻¹.
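The molar quantities and percentage yield quoted in Example 1 can be checked with a short calculation. The sketch below is illustrative only. The molecular formulae C₇H₇NO₄ (starting alcohol) and C₂₈H₂₅NO₆ (compound 2) are not stated in the text; they are assumptions worked out here from the compound names, and the helper names `molar_mass` and `mmol` are hypothetical.

```python
# Standard atomic masses (g/mol) for the elements needed here.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula: dict[str, int]) -> float:
    """Molar mass in g/mol from an element -> atom-count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def mmol(mass_g: float, formula: dict[str, int]) -> float:
    """Amount of substance in millimoles for a given mass in grams."""
    return 1000.0 * mass_g / molar_mass(formula)


# Assumed formulae (derived from the names, not given in the source).
STARTING_ALCOHOL = {"C": 7, "H": 7, "N": 1, "O": 4}    # 5-hydroxy-2-nitrobenzyl alcohol
COMPOUND_2 = {"C": 28, "H": 25, "N": 1, "O": 6}        # mono-DMT ether

start_mmol = mmol(5.11, STARTING_ALCOHOL)
product_mmol = mmol(13.49, COMPOUND_2)
percent_yield = 100.0 * product_mmol / start_mmol

if __name__ == "__main__":
    print(f"starting alcohol: {start_mmol:.1f} mmol")
    print(f"compound 2:       {product_mmol:.1f} mmol")
    print(f"yield:            {percent_yield:.0f}%")
```

Under these assumed formulae the calculation agrees with the 30.2 mmol and 95% figures given in the Example. The same helpers can be applied to the later Examples by substituting the appropriate formulae and masses.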

EXAMPLE 2 Synthesis of O-(4,4′-dimethoxytrityl)-5-[1-(3-bromo-1-oxypropyl)]-2-nitrobenzyl alcohol (Compound 3, Scheme 1)

[0131] To compound 2 (10.18 g, 21.6 mmol) dissolved in acetone (150 ml) was added 1,3-dibromopropane (11 ml, 108 mmol) and potassium carbonate (4.47 g, 32.3 mmol). The reaction mixture was then heated at 80° C. for a total of three hours and then stirred at room temperature for a further 16 hours. T.l.c. analysis (ether/pet. ether 40-60° C., 60%/40%) showed complete disappearance of the starting material and the formation of two new trityl containing species; R_(F) 0.48 major, R_(F) 0.23 minor. The acetone was then removed by rotary evaporation and the resultant residue partitioned between water and dichloromethane. The dichloromethane solution was separated and washed with brine. The dichloromethane solution was then dried over anhydrous magnesium sulphate and evaporated down to a gum. The gum was dissolved in dichloromethane (20 ml) and then applied to a silica gel column (6.5 cm×14 cm) which was eluted with ether/pet. ether 40-60° C., 60%/40%. The pure product fractions were combined and the solvent removed by rotary evaporation to give compound 3 as a white solid (8.18 g, 64%, mpt. 132-4° C., R_(F) 0.48 ether/pet. ether 40-60° C., 60%/40%). A small sample was recrystallised from ethyl acetate/pet. ether for analytical purposes, mpt. 132-4° C.

[0132] ¹H NMR (270 MHz, CDCl₃, δ): 2.40 (m, 2H, -CH₂-CH₂-CH₂-), 3.64 (t, 2H, J=6.32 Hz, CH₂Br), 3.79 (s, 6H, OCH₃), 4.27 (t, 2H, J=6.04 Hz, -OCH₂CH₂), 4.66 (s, 2H, ArCH₂ODMT), 6.84 (d, 4H, J=8.79 Hz, DMT aryl), 7.20-7.50 (m, 10H, 9 DMT aryl, 1 aryl), 7.68 (s, 1H, aryl), 8.1 (d, 1H, J=9.06 Hz, aryl).

[0133] IR (KBr disc) 1608, 1577, 1511, 1289, 1253, 1230, 1174, 1065,1030 cm⁻¹.

EXAMPLE 3 Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl]-N-(2-hydroxyethyl)amine (Compound 5, Scheme 1)

[0134] To sodium hydride (0.76 g of a 60% dispersion in oil, 19 mmol) under N₂ was added anhydrous THF (15 ml) followed by a slurry of diethanolamine (2 g, 19 mmol) in THF (30 ml) at such a rate as the evolution of hydrogen permitted. The reaction mixture was then stirred at room temperature for 30 minutes under N₂ during which time a grey precipitate formed. The generated alkoxide was quenched by the addition of tert-butylchlorodiphenylsilane (4.95 ml, 19 mmol) followed by stirring the reaction at room temperature for two hours under N₂. T.l.c. analysis (ethyl acetate) showed the generation of two new UV positive spots relative to starting material, major R_(F) 0.05, minor R_(F) 0.60. The THF was removed by rotary evaporation and the residue dissolved in a 0.1 M sodium bicarbonate solution. The product was then extracted with ethyl acetate (x2). The ethyl acetate extracts were then combined and washed with brine (x1). The ethyl acetate solution was then dried over anhydrous magnesium sulphate and evaporated down to an oil. This oil was applied to a silica gel column which was eluted with chloroform/methanol, 90%/10%. Fractions with an R_(F) of 0.33 were combined and rotary evaporated to give compound 5 as a white crystalline solid (3.93 g, 60%, mpt. 73-75° C.). A small sample was recrystallised from ethyl acetate/pet. ether 40-60° C. for analytical purposes, mpt. 76-77° C.

[0135] ¹H NMR (270 MHz, CDCl₃, δ): 1.06 (s, 9H, ᵗBu), 2.13 (brs, 1H, OH, D₂O exchangeable), 2.78 (m, 4H, CH₂NHCH₂), 3.63 (t, 2H, J=5.22 Hz, -CH₂OSi-), 3.78 (t, 2H, J=5.22 Hz, CH₂OH), 7.40 (m, 6H, aryl), 7.66 (m, 4H, aryl).

[0136] IR (KBr disc) 3100, 1430, 1114, 1080, 969, 749, 738, 707 cm⁻¹.

EXAMPLE 4 Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl]-N-[O-(3-(O-(4,4′-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N-(2-hydroxyethyl)amine (Compound 6, Scheme 1)

[0137] To compound 3 (7.46 g, 12.6 mmol) dissolved in 1-methyl-2-pyrrolidinone (65 ml) was added compound 5 (8.63 g, 25.2 mmol). The reaction mixture was then heated at 80° C. for a total of 5 hours before being left to cool and stir at room temperature for a further 16 hours. T.l.c. analysis (ethyl acetate) showed the formation of a new trityl containing species, R_(F) 0.51, and residual amounts of the two starting materials. The reaction mixture was poured into a mixture of water (600 ml) and brine (100 ml) and the product extracted with ethyl acetate (3×200 ml). The ethyl acetate extracts were then combined and dried over anhydrous magnesium sulphate. The ethyl acetate was then removed by rotary evaporation to give a brown gum from which a crystalline product slowly formed. The minimum amount of ethyl acetate was added to dissolve up the residual gum such that the crystalline product, the hydrogen bromide salt of compound 5, could be filtered off. The ethyl acetate solution was then applied to a silica gel column (13 cm×6.5 cm) which was eluted with ethyl acetate. Insufficient separation of residual compound 3 and the desired product was obtained from this column so the product fractions were combined and evaporated to a gum. This gum was dissolved up in the minimum of ethyl acetate necessary and applied to another silica gel column (14 cm×6.5 cm), eluting with a gradient, first ethyl acetate/pet. ether 40-60° C., 50%/50%, followed by ethyl acetate. The product fractions were combined and the solvent removed by rotary evaporation to give compound 6 as a gum. The last traces of solvent were removed by placing the gum under high vacuum for one hour. The yield of product was 7.64 g, 71%.

[0138] ¹H NMR (270 MHz, CDCl₃, δ): 1.04 (s, 9H, ᵗBu), 1.97 (m, 2H, -CH₂CH₂CH₂-), 2.7 (m, 6H, NCH₂), 3.56 (m, 2H, CH₂OH), 3.75 (m, 2H, CH₂OSi), 3.78 (s, 6H, DMT-OCH₃), 4.12 (m, 2H, ArOCH₂CH₂), 4.64 (s, 2H, ArCH₂ODMT), 6.74-6.85 (m, 5H, aryl), 7.2-7.65 (m, 20H, aryl), 8.05 (d, 1H, aryl).

[0139] IR (KBr disc), 1608, 1579, 1509, 1287, 1251, 1232, 1112, 1092, 1064, 1035, 826, 754, 703, 613 cm⁻¹.

EXAMPLE 5 Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl]-N-[O-(3-(O-(4,4′-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N-[O-(3-carboxylatopropionyl)-2-oxyethyl]amine (Compound 7, Scheme 1)

[0140] To compound 6 (5.64 g, 6.59 mmol) dissolved in anhydrous dichloromethane (40 ml) and anhydrous pyridine (50 ml) was added succinic anhydride (2.06 g, 20.6 mmol) and dimethylaminopyridine (210 mg, 1.72 mmol) and the flask stoppered. The reaction was then stirred at room temperature for a total of 72 hours. T.l.c. analysis (methanol/ethyl acetate, 10%/90%) showed the formation of a new trityl containing species, R_(F) 0.45, and the disappearance of the starting material. The solvent was removed by rotary evaporation with the last traces of pyridine being removed by co-evaporation with toluene (x2). The resultant gum was then partitioned between chloroform and water. The organic phase was separated and the aqueous phase further extracted with chloroform (x1). The organic phases were then combined and washed with brine (x1). The chloroform solution was then dried with anhydrous magnesium sulphate and evaporated to a gum. The last traces of solvent were then removed by placing the gum under high vacuum for one hour to give compound 7, 6.75 g. This product was used in the next step without further purification.

[0141] ¹H NMR (270 MHz, CDCl₃, δ): 1.0 (s, 9H, ᵗBu), 1.9 (m, 2H, CH₂CH₂CH₂), 2.5 (m, 4H, COCH₂CH₂COOH), 2.7 (m, 6H, N-CH₂), 3.7 (m, 2H, CH₂OSi), 3.75 (s, 6H, DMT-OCH₃), 4.1 (m, 4H, CH₂OCO and Ar-OCH₂), 5.6 (s, 2H, ArCH₂ODMT), 6.7 (d, 1H, aryl), 6.8 (d, 4H, aryl), 7.2-7.7 (m, 20H, aryl), 8.02 (d, 1H, aryl).

[0142] IR (CHCl₃ solution) 1736, 1608, 1579, 1509, 1288, 1251, 1232,1175, 1158, 1112, 1093, 1065, 1035, 755, 703 cm⁻¹.

EXAMPLE 6 Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl]-N-[O-(3-(O-(4,4′-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N-[(O-(succinyl(3-carboxylatopropionyl)))-2-oxyethyl]amine (Compound 8, Scheme 1)

[0143] To compound 7 (2.99 g, 3.13 mmol) dissolved in anhydrous dichloromethane (30 ml) was added dicyclohexylcarbodiimide (0.710 g, 3.45 mmol) and N-hydroxysuccinimide (0.396 g, 3.44 mmol) and the flask stoppered. The reaction mixture was then allowed to stir at room temperature for 18 hours during which time a white precipitate formed. The white precipitate was filtered off and the dichloromethane solution washed with water (x1) and brine (x1). The dichloromethane solution was then dried over anhydrous magnesium sulphate and the solvent rotary evaporated off to give a foam, 3.26 g (99%). T.l.c. analysis (ethyl acetate) showed only one trityl containing species, R_(F) 0.74, and no significant contaminant. Attempts to provide an analytical sample by passing a small amount of material down a silica gel column resulted in the decomposition of the active ester back to the acid (Compound 7). The material was therefore used in all further experiments without further purification.

[0144] ¹H NMR (270 MHz, CDCl₃, δ): 1.04 (s, 9H, ᵗBu), 1.97 (m, 2H, CH₂CH₂CH₂), 2.50-2.75 (m, 6H, succinyl CH₂ + -OCCH₂), 2.76-2.86 (m, 6H, NCH₂), 3.08 (m, 2H, CH₂CO₂ succinyl), 3.77 (s, 6H, DMT-OCH₃), 3.86 (m, 2H, CH₂OSi), 4.1-4.2 (m, 4H, ArOCH₂ + CH₂O₂C), 4.63 (s, 2H, ArCH₂ODMT), 6.7-6.9 (m, 5H, aryl), 7.01-7.7 (m, 20H, aryl), 8.05 (d, 1H, aryl).

[0145] IR (KBr disc), 1742, 1713, 1509, 1288, 1251, 1211, 1175, 1090, 1067 cm⁻¹.

EXAMPLE 7 Derivatised long chain alkylamino controlled pore glass (Compound 9, Scheme 2)

[0146] Long chain alkylamino controlled pore glass (Sigma Chemical Co, 3.5 g) was pre-treated with trichloroacetic acid (1.5 g) dissolved in dichloromethane (50 ml) for 2½ hours, washed with aliquots of dichloromethane (100 ml total) and anhydrous ether (100 ml total) and dried in vacuo. To the CPG support was then added anhydrous pyridine (35 ml), dimethylaminopyridine (42 mg, 344 μmol), triethylamine (280 μl, 2.01 mmol) and compound (8) (see Scheme 1) (736 mg, 700 μmol). The mixture was then gently agitated for a total of 18 hours after which time the beads were given multiple washes of pyridine (7×10 ml), methanol (5×15 ml) and chloroform (5×15 ml) and then dried in vacuo.

EXAMPLE 8

[0147] Methylation of the tertiary amino groups attached to the CPG support (Compound 10, Scheme 2)

[0148] To the derivatised long chain alkylamino controlled pore glass (Compound 9, Scheme 2) (1.01 g) was added anhydrous THF (10 ml) and iodomethane (0.5 ml, 8 mmol). The mixture was then gently agitated for a total of 18 hours after which time the beads were given multiple washes of anhydrous THF (5×10 ml) and then dried in vacuo.

EXAMPLE 9 Synthesis of mono protected 1,3-propanediol derivatives (Compounds 12a and 12b, Scheme 3): general procedure

[0149] To sodium hydride (1.05 g of a 60% dispersion in oil, 26.3 mmol) under N₂ was added anhydrous THF (10 ml) followed by dropwise addition of the 1,3-propanediol derivative (26.3 mmol) dissolved in anhydrous THF (20 ml). Stirring for an additional 30 minutes under N₂ ensured alkoxide formation as noted by the formation of a grey precipitate. The generated alkoxide was quenched by the dropwise addition of tert-butylchlorodiphenylsilane (7.24 g, 26.3 mmol) dissolved in anhydrous THF (20 ml) followed by stirring of the reaction under N₂ for a further 40 minutes. The THF was then removed by rotary evaporation and the residue partitioned between dichloromethane and 0.1 M sodium bicarbonate solution. The dichloromethane solution was separated off and washed with brine (x1). The dichloromethane solution was then dried over magnesium sulphate and evaporated down to an oil. This oil was applied to a silica gel column (16 cm×5 cm) which was eluted with an ether/pet. ether 40-60° C., 30%/70% mixture. The product fractions were combined and rotary evaporated down to provide the desired 1,3-propanediol derivative.

[0150] For individual details of the compounds see below.

[0151] 12a 1-O-tert-butyldiphenylsilyl-1,3-propanediol, white crystalline solid, R_(F) 0.21 ether/pet. ether 40-60° C., 30%/70%, 7.61 g, 92%, mpt. 40-42° C.

[0152] IR (KBr disc) 3400, 1448, 1112, 822, 734, 702, 689, 506, 488 cm⁻¹.

[0153] ¹H NMR (270 MHz, CDCl₃, δ): 1.06 (s, 9H, ᵗBu), 1.80 (m, 2H, CH₂CH₂CH₂), 2.45 (t, 1H, OH), 3.84 (m, 4H, OCH₂CH₂CH₂O-), 7.40 (m, 6H, aryl), 7.68 (m, 4H, aryl).

[0154] 12b 2-methyl-1-O-tert-butyldiphenylsilyl-1,3-propanediol. Colourless oil, R_(F) 0.21 ether/pet. ether 40-60° C., 30%/70%, 6.60 g, 77%.

[0155] IR (thin film) 3400, 1472, 1428, 1087, 1040, 823, 740, 702 cm⁻¹.

[0156] ¹H NMR (270 MHz, CDCl₃, δ): 0.82 (d, 3H, J=6.87 Hz, CH₃), 1.06 (s, 9H, ᵗBu), 2.0 (m, 1H, CH-CH₃), 2.68 (t, 1H, OH), 3.64 (m, 4H, CH₂CH(CH₃)CH₂), 7.40 (m, 6H, aryl), 7.68 (m, 4H, aryl).

[0157] See P G McDougal et al JOC, 51, 3388 (1986) for general procedures for the monosilylation of symmetric 1,n-diols.

EXAMPLE 10 Synthesis of the treslate derivatives (Compounds 13a and 13b, Scheme 3): general procedures

[0158] To the alcohol derivative (4.94 mmol) dissolved in anhydrous dichloromethane (10 ml) and dry triethylamine (0.84 ml, 6.03 mmol) under N₂ and cooled to between −15 and −30° C. was added the tresyl chloride (1 g, 5.48 mmol) in anhydrous dichloromethane (5 ml) dropwise over 20-40 minutes. Stirring for an additional 30 minutes under N₂ at −15 to −30° C. completed the reaction. The reaction mixture was then transferred to a separatory funnel and washed with ice cooled 1.0 M hydrochloric acid (x1), ice cooled water (x1) and ice cooled brine (x1). The dichloromethane solution was then dried over magnesium sulphate and the solvent rotary evaporated off to give the treslate. The treslates were stored at −20° C. under N₂ until required.

[0159] For individual details of the compounds see below.

[0160] 13a 1-O-tert-butyldiphenylsilyl-3-O-tresyl-1,3-propanediol. White crystalline solid, 1.74 g, 77%, mpt. 34-35° C. Three ml of this reaction mixture was removed prior to work up of the reaction for addition to other reactions.

[0161] ¹H NMR (270 MHz, CDCl₃, δ): 1.06 (s, 9H, ᵗBu), 1.97 (m, 2H, CH₂CH₂CH₂), 3.77 (t, 2H, J=5.49 Hz, CH₂-O-Si), 3.84 (q, 2H, J=8.79 Hz, CF₃-CH₂-O), 4.54 (t, 2H, J=6.05 Hz, Tresyl O-CH₂), 7.42 (m, 6H, aryl), 7.64 (m, 4H, aryl).

[0162] IR (KBr disc) 1386, 1329, 1274, 1258, 1185, 1164, 1137, 1094, 941, 763, 506 cm⁻¹.

[0163] 13b 2-methyl-1-O-tert-butyldiphenylsilyl-3-O-tresyl-1,3-propanediol. Colourless oil, 2.57 g, 99%.

[0164] ¹H NMR (270 MHz, CDCl₃, δ): 0.97 (d, 3H, J=6.87 Hz, CH₃), 1.06 (s, 9H, ᵗBu), 2.10 (m, 1H, CHCH₃), 3.6 (m, 2H, CH₂OSi), 3.8 (q, 2H, J=8.79 Hz, CF₃CH₂), 4.40 (m, 2H, Tresyl-O-CH₂), 7.40 (m, 6H, aryl), 7.64 (m, 4H, aryl).

[0165] For general details of Treslates see R K Crossland et al JACS,93, 4217 (1971).

EXAMPLE 11 Synthesis of N-[acetoxy-2-oxyethyl]-N-[O-(3-(O-(4,4′-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N-[2-hydroxyethyl]amine (Compound 15, Scheme 4)

[0166] To compound 14 (1.72 g, 1.92 mmol) dissolved in anhydrous THF (20 ml) was added tetrabutylammonium fluoride (0.55 ml of a 1 M solution in THF, 1.92 mmol). The reaction was then stirred for a total of two hours at room temperature. The reaction mixture was then diluted with water (50 ml) and the THF removed by rotary evaporation. The aqueous solution was then extracted with chloroform (x1). The organic solution was dried over anhydrous sodium sulphate and evaporated down to a gum. The product was purified by silica gel chromatography eluting the column with ethyl acetate. Product fractions were combined and rotary evaporated down to give compound 15 as a colourless gum which slowly crystallised on standing; 0.73 g, 58%, mpt. 95-97° C., R_(F) 0.26 ethyl acetate.

[0167] ¹H NMR (270 MHz, CDCl₃, δ): 1.75 (brs, 1H, OH), 2.0-2.1 (m, 5H, O₂CCH₃ + CH₂CH₂CH₂), 2.70-2.81 (m, 6H, CH₂N), 3.58 (m, 2H, CH₂OSi), 3.79 (s, 6H, DMT-OCH₃), 4.17 (m, 4H, CH₂O), 4.64 (s, 2H, ArCH₂ODMT), 6.83 (d, 4H, DMT-aryl), 7.2-7.5 (m, 10H, aryl), 7.69 (s, 1H, aryl), 8.10 (d, 1H, aryl).

[0168] IR (KBr disc), 3459, 1738, 1608, 1577, 1506, 1444, 1313, 1288,1250, 1230, 1175, 1154, 1070, 1035, 984 cm⁻¹.

EXAMPLE 12 Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl]-N-[O-(3-(O-(4,4′-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N-[acetoxy-2-oxyethyl]amine (Compound 14, Scheme 4)

[0169] To compound 6 (1.73 g, 2.02 mmol) dissolved in anhydrous pyridine (10 ml) was added acetic anhydride (0.5 ml, 4.54 mmol) and 4-dimethylaminopyridine (55 mg, 0.45 mmol) and the flask stoppered. The reaction mixture was then stirred at room temperature for a total of 16 hours after which time t.l.c. analysis (methanol/ethyl acetate 5%/95%) showed the complete disappearance of the starting material and the formation of a new trityl containing spot, R_(F) 0.80. The pyridine was removed by rotary evaporation with the last traces being removed by co-evaporation with toluene (x2). The resultant gum was partitioned between chloroform and water. The chloroform solution was separated off and washed with brine (x1). The chloroform solution was then dried over anhydrous magnesium sulphate and the solvent rotary evaporated off to give a colourless gum, 1.94 g. This material was pure enough to be used in the next reaction without any further purification.

[0170] ¹H NMR (270 MHz, CDCl₃, δ): 1.04 (s, 9H, ᵗBu), 1.9 (m, 2H, CH₂CH₂CH₂), 2.01 (s, 3H, -O₂CCH₃), 2.74 (m, 6H, CH₂N), 3.7 (m, 2H, CH₂OSi), 3.8 (s, 6H, DMT-OCH₃), 4.1 (m, 4H, CH₂O), 4.63 (s, 2H, ArCH₂ODMT), 6.78 (d, 1H, aryl), 6.83 (d, 4H, DMT aryl), 7.2-7.8 (m, 20H, aryl), 8.05 (d, 2H, aryl).

EXAMPLE 13 Synthesis of N-[acetoxy-2-oxyethyl]-N-[O-(3-(O-(4,4′-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N-[O-(tert-butyldiphenylsilyl)-3-oxo-6-oxyhexyl]amine (Compound 16, Scheme 4)

[0171] To compound 15 (66 mg, 0.10 mmol) dissolved in anhydrous acetonitrile (5 ml) was added potassium carbonate (55 mg, 0.4 mmol) and compound 13a (1 ml of the reaction mixture, approximately 0.30 mmol) and the flask then stoppered with a calcium chloride drying tube. The reaction mixture was then stirred at room temperature for a total of 22 hours after which time the potassium carbonate was filtered off and the solvent removed by rotary evaporation. The resultant oil was then applied to a silica gel column (14 cm×1 cm) and the product eluted off with an ether/pet. ether 40-60° C., 75%/25% mixture. The pure product fractions were combined and evaporated down to a clear gum, 6 mg, 6%, R_(F) 0.47 in ether/pet. ether 40-60° C., 80%/20%.

[0172] ¹H NMR (270 MHz, CDCl₃, δ): 1.05 (s, 9H, ᵗBu), 1.8 (m, 2H, CH₂CH₂CH₂OSi), 1.9 (m, 5H, O₂CCH₃ + ArOCH₂-), 2.76-2.92 (m, 6H, CH₂N), 3.51 (t, 2H, J=6.6 Hz, OCH₂CH₂CH₂OSi), 3.79 (s, 6H, DMT-OCH₃), 3.85 (m, 2H, CH₂OSi), 4.12-4.23 (m, 4H, ArOCH₂ + NCH₂CH₂OCOCH₃), 4.64 (s, 2H, ArCH₂ODMT), 6.83 (m, 5H, 1 aryl + DMT-aryl), 7.23-7.50 (m, 16H, aryl), 7.68 (m, 4H, aryl), 8.10 (d, 1H, J=9.06 Hz, aryl).

[0173] By analogous reaction conditions to the above, the following compound has also been synthesised utilising the treslate 13b.

[0174] N-[acetoxy-2-oxyethyl]-N-[O-(3-(O-(4,4′-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N-[O-(tert-butyldiphenylsilyl)-5-methyl-3-oxo-6-oxyhexyl]amine. The compound is a clear gum, R_(F) 0.53 in ether/pet. ether 40-60° C., 80%/20%.

[0175] ¹H NMR (270 MHz, CDCl₃, δ): 0.88 (d, 3H, CH-CH₃), 1.00 (s, 9H, ᵗBu), 1.9-2.1 (m, 6H, O₂CCH₃ + CH-CH₃ + CH₂CH₂CH₂), 2.7-3.0 (m, 6H, CH₂N), 3.4-3.7 (m, 4H, CH₂O-), 3.79 (s, 6H, DMT-OCH₃), 4.0-4.4 (m, 6H, CH₂O-), 4.64 (s, 2H, ArCH₂ODMT), 6.83 (m, 5H, aryl), 7.2-7.7 (m, 20H, aryl), 8.01 (d, 1H, aryl).

EXAMPLE 14 Synthesis of oligonucleotides on solid supports

[0176] Controlled pore glass carrying linkers 9 and 10 (compounds 9 and 10 in Scheme 2) was loaded into the columns used in the automatic oligonucleotide synthesiser (ABI 381A); the amounts used provided for 0.2 or 1 μmol scale synthesis. The columns were inserted in the automatic synthesiser which was then programmed for appropriate cycles. Two different types of nucleotide precursors were used: normal phosphoramidites, with dimethoxytrityl protecting groups on the 5′ hydroxyls; and "reverse synthons" with 5′ phosphoramidites and dimethoxytrityl protecting groups on the 3′ hydroxyls. A list of oligonucleotides synthesised on these supports is shown in Table 4, in which R9 and R10 derive from compounds 9 and 10 respectively. Yields were monitored from the amount of dimethoxytrityl group released at each coupling. These yields corresponded to those obtained on the CPG supports used for conventional oligonucleotide synthesis.

TABLE 4
End group(s)    Sequence    Normal direction    Reverse direction
R9              T₅          ✓
R10             T₅          ✓
R10, DMT        T₅          ✓
R9, DMT         T₅          ✓
R10, DMT        T₅          ✓
R10             A₁₀         ✓

EXAMPLE 15 Synthesis of tags under conditions which leave the analyte intact.

[0177] After synthesising 5′R9T₅ on support 9, the solid support was divided; part was treated with 5 mM tetrabutylammonium fluoride in THF for 10 min. at room temperature to remove the t-butyldiphenylsilyl protecting group. Both samples were treated with 29% ammonia at room temperature overnight to remove the products from the solid support. Ammonia was removed under vacuum, and the solid residue dissolved in water. HPLC showed the successful removal of the silyl protecting group with retention of the DMT group. This example shows that the two protecting groups can be removed under conditions which leave the other in place; and further, that removal of the protecting groups leaves the oligonucleotide chain intact.

EXAMPLE 16

[0178] Biochemical reactions of tagged analytes.

[0179] 16a. Enzymatic phosphorylation of tagged oligonucleotides. For many purposes, it will be useful to have oligonucleotides which have a phosphate group at the 5′ end. Such a group is necessary if the oligonucleotide is to be used as the donor in a ligation reaction; and it is a useful way of introducing a radioactive group to test biochemical properties.

[0180] The oligonucleotides A₅, A₁₀, and T₅ were made with the tags R9 and R10 attached to the 3′ ends, with and without the silyl protecting group removed. (This was achieved by treating the oligonucleotide, still on the solid support, with a 5 mM solution of tetrabutylammonium fluoride in acetonitrile, at room temperature for 15 min.) These oligonucleotides were phosphorylated using T4 polynucleotide kinase and gamma-³³P-ATP using standard protocols recommended by the supplier. Thin layer chromatography of the products on polyethyleneimine (PEI) impregnated cellulose developed in 0.5 M ammonium bicarbonate showed in each case that the labelled phosphorus had been transferred almost completely to the oligonucleotide.

[0181] 16b. Enzymatic ligation of tagged oligonucleotides.

[0182] For some applications of tagged oligonucleotides, it will be useful to ligate them to an acceptor. We have shown that tagged oligonucleotides can take part in enzymatic ligation by the following tests:

[0183] (1) Oligonucleotides tagged at the 5′-end. In this test, thetemplate was 5′ ATCAAGTCAGAAAAATATATA (SEQ ID No. 1). This washybridised to the donor, 3 TAGTTCAGTC (SEQ ID No. 2), which had beenphosphorylated at its 5′-end using radioactive phosphorus. Four ligationreactions were carried out, each with a modification of the sequence T₅,which could ligate to the 5′ phosphorylated end of the donor afterhybridising to the run of 5 A-s in the template. The four oligoT's usedin the reactions differed in the nature of their 5′-end. One had adimethoxytrityl group attached through the hydroxyl. The second andthird had tags R9 and R10 attached to the 5-end through a phosphodiesterbond. The fourth was a positive control, with a normal 5′OH. A negativecontrol lacked any oligoT. Ligation reactions were performed using T4ligase according to the suppliers' instructions. Reactions were analysedby TLC on PEI-cellulose, developed in 0.75 M ammonium bicarbonatesolution. All four reactions showed an additional spot on thechromatogram, of lower mobility than the donor; as expected, thenegative control showed no additional spot. This illustrates howoligonucleotides with different tags can take part in sequence-specificligation reactions.
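The geometry of this ligation test can be checked with a short sketch. It uses only the two sequences quoted above; the helper names (reverse_complement, annealing_site) are hypothetical and chosen for illustration, and the check covers perfect Watson-Crick pairing only, not the enzymology.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def annealing_site(template, oligo_5_to_3):
    """Return (start, end) of the template region the oligo pairs with, or None."""
    site = reverse_complement(oligo_5_to_3)
    start = template.find(site)
    return (start, start + len(site)) if start >= 0 else None

TEMPLATE = "ATCAAGTCAGAAAAATATATA"   # written 5'->3' (SEQ ID No. 1)
DONOR_3_TO_5 = "TAGTTCAGTC"          # written 3'->5' as in the text (SEQ ID No. 2)
OLIGO_T = "TTTTT"                    # T5, the oligonucleotide carrying the tag

donor = DONOR_3_TO_5[::-1]           # rewrite the donor 5'->3'
donor_site = annealing_site(TEMPLATE, donor)
oligo_t_site = annealing_site(TEMPLATE, OLIGO_T)

# The donor's 5' phosphate pairs with the last base of its site; T5 starts
# at the very next template base, so the two ends abut with no gap.
adjacent = donor_site[1] == oligo_t_site[0]
```

Under these assumptions the donor covers the first ten template bases and T₅ covers the run of five A's immediately after it, which is the arrangement a template-directed ligation needs.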

[0184] Cozzarelli et al (1967) have shown that polynucleotides attached to solid supports can be ligated to an acceptor in the presence of a complementary template.

EXAMPLE 17

[0185] Hybridisation of tagged oligonucleotides to oligonucleotides tethered to a solid support.

[0186] Example 16b shows that tagged oligonucleotides can take part in ligation reactions, implying that they can also take part in duplex formation in solution, as ligation depends on this process. The following experiment shows that they can also form duplexes with oligonucleotides tethered to a solid support. T₁₀ was synthesised on the surface of a sheet of aminated polypropylene according to the manufacturer's instructions. It is known that this process yields around 10 pmols of oligonucleotide per mm². A solution in 3.5 M tetramethylammonium chloride of A₁₀ (65 pmol per microlitre), labelled at the 5′ end with ³³P, and tagged at the 3′ end with R10, was laid on the surface of the derivatised polypropylene and left overnight at 4° C. After washing in the hybridisation solvent, it was found that around one third of the probe had hybridised to the tethered oligo-dT. This is close to the theoretical limit of hybridisation, showing that tagged oligonucleotides can take part in hybridisation reactions with high efficiency.

EXAMPLE 18

[0187] Photolysis of tags.

[0188] The potential to remove tags by photolysis would greatly enhance their usefulness: it would allow for direct analysis by laser desorption in the mass spectrometer; it would provide a simple method of removing the tags to allow other biochemical or chemical processes.

[0189] 18a. Bulk photolysis.

[0190] The nitrobenzyl group is known to be labile to irradiation at 305 nm. Solutions of R10A₁₀ and R10T₅ in water were irradiated at 2 cm from a transilluminator for 20 min. under conditions known to cause no detectable damage to nucleic acids. Analysis by HPLC showed the expected products of photocleavage, with no detectable residue of the original compound.

[0191] 18b. Laser induced photolysis in the mass spectrometer.

[0192] Samples of R10T₅ and T₅R10 were deposited on the metal target of a time of flight mass spectrometer (Finnigan Lasermat) without added matrix. The spectrum showed a single saturated peak at around mass 243 in the positive mode that was absent in other samples.

EXAMPLE 19

[0193] Identification by mass spectrometry of different tags attached to different analytes.

[0194] A sequence of five thymidine residues with a dimethoxytrityl group attached as a tag to the 3′ end was synthesised by conventional solid phase methods, but using “reverse synthons”. In the mass spectrometer, this compound gave a large and distinct peak at mass 304, in the positive ion mode. By contrast a sequence of ten adenosine residues carrying the tag designated R10 above gave a large and distinct peak at mass 243 in the positive ion mode. In both cases, laser desorption was carried out in the absence of matrix. In both cases the peaks are absent from the oligonucleotides which have no tag. These examples show that it is easily possible to identify an analyte sequence from the presence of a peak in the mass spectrometric trace that derives from a tag incorporated during the synthesis of the analyte, and that characteristic tags are readily identified by their different mass.
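The read-out described in this example amounts to a lookup from tag mass to analyte. The following is a minimal sketch of that lookup, using only the two peak masses reported above; the table name, the function name and the 0.5 mass unit tolerance are assumptions made for illustration, not values from the experiments.

```python
# Tag peak (nominal mass) -> (tag, analyte synthesised with that tag).
# The pairings are those of this example only.
TAG_PEAKS = {
    304: ("dimethoxytrityl", "T5"),
    243: ("R10", "A10"),
}

def identify_analytes(observed_masses, tolerance=0.5):
    """Return (tag, analyte) for every observed peak that matches a known tag mass."""
    hits = []
    for observed in observed_masses:
        for tag_mass, assignment in TAG_PEAKS.items():
            if abs(observed - tag_mass) <= tolerance:
                hits.append(assignment)
    return hits

# Hypothetical peak list: one tag peak plus an unrelated peak.
found = identify_analytes([243.2, 500.0])
```

A peak that matches no entry is simply ignored, which mirrors the observation that untagged oligonucleotides give neither peak.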

FIGURE LEGENDS

[0195] FIG. 1. General scheme for synthesis of molecules with specific tags.

[0196] Synthesis starts from a linker (L) with at least one site for the addition of groups for synthesising the analyte and one for synthesising the tag. (The linker may also be attached reversibly to a solid support during synthesis, and may have sites for generating groups such as charged groups which may help in analysis.) P_(a) and P_(r) are temporary protecting groups for the analyte precursors and the reporters respectively; they will be removable by different treatments. For example, P_(a) may be an acid or base labile group such as trityl, F-MOC or t-BOC, and P_(r) a group removable by treatment with fluoride such as a silyl residue. Groups U-Z may also have protecting groups which must be stable to the reagents used to remove P_(a) and P_(r). Coupling chemistries will be different for different analyte types; standard methods are available for oligonucleotide and peptide synthesis.

[0197] Three different types of tags are described in FIG. 2. For the first scheme, each extension of the tag is carried out with a reporter which is specific for both position and type of residue added to the analyte. Capping is not important for this scheme.

[0198] In the second and third schemes, position is defined by the total mass of reporter reached at the stage in synthesis when the residue is added to the analyte. In this case it is important to terminate part of the extension of the tag by capping a portion of the molecules. The second and third schemes differ from each other in the way the reporters are added. In the second they are in the extension agents; in the third they are in the caps.

[0199] FIG. 2. Three types of molecule-specific tags. A. Illustrates tags made of reporters (E) that specify both position (subscript) and identity (superscript) of the groups in the analyte (U-Z). Such a set could comprise a series of aliphatic chains of increasing formula weight to specify position: for example, methylene for position 1, ethylene for 2, propylene for 3 etc. These could be differentiated into group-specific types by different isotopic compositions of carbon and hydrogen: for example, there are six different isotopic compositions of CH₂, as shown in Table 1 above. Four of these differ by one mass unit and should be readily distinguished by mass spectrometry. Other ways of differential labelling can be envisaged. For example, either position or group could be marked by reporter groups with different charges. Such groups can be separated and recognised by a number of methods including mass spectrometry. B. Shows tags made by partial synthesis, such that any structure of the analyte is attached to a series of tags; the first member of the series has a reporter group specific for the first group of the analyte; the second has the first reporter plus a second reporter specific for the second group of the analyte and so on. Such a series can readily be made by using two kinds of precursor for extending the tag: one which is protected by a reversible blocking group and one which prevents further extension. For example, a mixture of RX and P—(CH₂)_(n)X, where R is a non-reactive aliphatic group such as methyl or ethyl, P is a reversible protecting group and X is an activated residue that can react with the group protected by P. Those molecules which have been “capped” by the non-reactive aliphatic group will not take part in the next round of deprotection and extension.
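The count of six isotopic compositions of CH₂, of which four have distinct masses, can be reproduced with a short enumeration. This is a sketch that assumes nominal (integer) isotope masses and the isotopes ¹²C, ¹³C, ¹H and ²H (written H and D); the names used are illustrative only.

```python
# Nominal isotope masses, an assumption made for illustration.
CARBON = {"12C": 12, "13C": 13}
HYDROGEN = {"H": 1, "D": 2}

# The two hydrogens are interchangeable, so HD and DH count once.
HYDROGEN_PAIRS = (("H", "H"), ("H", "D"), ("D", "D"))

def ch2_compositions():
    """Return {composition label: nominal mass} for every isotopic form of CH2."""
    result = {}
    for carbon_name, carbon_mass in CARBON.items():
        for first, second in HYDROGEN_PAIRS:
            label = f"{carbon_name}{first}{second}"
            result[label] = carbon_mass + HYDROGEN[first] + HYDROGEN[second]
    return result

compositions = ch2_compositions()
distinct_masses = sorted(set(compositions.values()))
```

Two pairs of compositions share a nominal mass (for example ¹³CH₂ and ¹²CHD both give 15), which is why six compositions yield only four masses separated by one unit each.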

[0200] In B the group-specific information is contained in the residues used to extend the synthesis. As in A, the information could be provided using mass isotopes. For example, every addition of a CH₂ residue labelled with the isotopes of C and H to P—(CH₂)_(n)X adds further sites that can provide different mass to the reporter. The masses of the (CH₂)_(n) range from 14n to 17n and there are 4+3(n-1) different masses in the range. Thus for the ethylene group there are seven distinct masses in the range 28 to 34, and for the propylene group, ten in the range 42 to 51.
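The formula 4+3(n-1) can be checked by building up the set of reachable masses one CH₂ unit at a time. The sketch below assumes that each unit independently takes one of the four nominal masses 14 to 17; the function name is hypothetical.

```python
UNIT_MASSES = (14, 15, 16, 17)  # nominal masses available to a single CH2 unit

def methylene_chain_masses(n):
    """Sorted distinct nominal masses reachable by a (CH2)n chain."""
    masses = {0}
    for _ in range(n):
        # Extend every mass reached so far by each possible unit mass.
        masses = {m + unit for m in masses for unit in UNIT_MASSES}
    return sorted(masses)

ethylene = methylene_chain_masses(2)    # 28 .. 34
propylene = methylene_chain_masses(3)   # 42 .. 51
```

Because the unit masses are consecutive integers, every mass between 14n and 17n is reachable, so the count is 3n+1, which is the same as 4+3(n-1).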

[0201] C. Shows how the group-specific information can be added in a different way; in this case it is contained in the chain terminator, the “cap” in example B.

[0202] Again, different masses could be provided by labelling an aliphatic residue. Positional information is provided by the length of the extension at which the terminator was added. Suppose that E is (CH₂)₂—O, and the terminators are isotopically labelled methyl groups with formula weights from 15 to 19. Each extension will add 44 mass units to the reporter. The mass range for the shortest reporter would be from 44+15=59 to 44+19=63. The range for the second position would be from 88+15=103 to 88+19=107, and so on to the sixth where the range is from 264+15=279 to 264+19=283. There is no overlap in this range, and it can be seen that the number of reporters and the range could be extended by using terminators and extensions with more atoms.
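The arithmetic of this capping scheme can be laid out as a small sketch. It assumes nominal masses only: 44 for each (CH₂)₂—O extension and 15 to 19 for the labelled methyl caps, so that a reporter capped at position p has mass 44p plus the cap mass (six extensions give 6 × 44 = 264). The function names are hypothetical.

```python
EXTENSION_MASS = 44          # nominal mass of one (CH2)2-O extension
CAP_MASSES = range(15, 20)   # isotopically labelled methyl caps, 15 to 19

def reporter_range(position):
    """Lowest and highest reporter mass for a cap added at the given position."""
    core = position * EXTENSION_MASS
    return core + min(CAP_MASSES), core + max(CAP_MASSES)

def decode(mass):
    """Map an observed reporter mass to (position, cap mass), or None if invalid."""
    position, remainder = divmod(mass, EXTENSION_MASS)
    if position >= 1 and remainder in CAP_MASSES:
        return position, remainder
    return None

ranges = {p: reporter_range(p) for p in range(1, 7)}
# Consecutive positions never overlap: each range ends below the next start.
no_overlap = all(ranges[p][1] < ranges[p + 1][0] for p in range(1, 6))
```

Decoding is unambiguous here because every cap mass is smaller than one extension, so the position and the cap can be read off as quotient and remainder.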

LITERATURE CITED

[0203] 1. Brenner, S. and Lerner, R. A. (1992). Encoded combinatorial chemistry. Proc. Natl. Acad. Sci. USA 89: 5381-5383.

[0204] 2. Drmanac, R., Labat, I., Brukner, I., and Crkvenjakov, R. (1989). Sequencing of megabase plus DNA by hybridization: Theory of the method. Genomics 4: 114-128.

[0205] 3. Pillai, V. N. R. (1980). Photoremovable protecting groups in organic chemistry. Synthesis 39: 1-26.

[0206] 4. Hoheisel, J. D., Maier, E., Mott, R., McCarthy, L., Grigoriev, A. V., Schalkwyk, L. C., Nitzetic, D., Francis, F. and Lehrach, H. (1993). High resolution cosmid and P1 maps spanning the 14 Mbp genome of the fission yeast Schizosaccharomyces pombe. Cell 73: 109-120.

[0207] 5. Khrapko, K. R., Lysov, Yu. P., Khorlyn, A. A., Shick, V. V., Florentiev, V. L., and Mirzabekov. (1989). An oligonucleotide hybridization approach to DNA sequencing. FEBS Lett. 256: 118-122.

[0208] 6. Patchornik, A., Amit, B. and Woodward, R. B. (1970). Photosensitive protecting groups. J. Am. Chem. Soc. 92:21: 6333-6335.

[0209] 7. Ross, M. T., Hoheisel, J. D., Monaco, A. P., Larin, Z., Zehetner, G., and Lehrach, H. (1992). High density gridded Yac filters; their potential as genome mapping tools. In “Techniques for the analysis of complex genomes.” Anand, R. ed. (Academic Press) 137-154.

[0210] 8. Southern, E. M. (1988). Analyzing Polynucleotide Sequences. International Patent Application PCT GB 89/00460.

[0211] 9. Southern, E. M., Maskos, U. and Elder, J. K. (1992). Analysis of Nucleic Acid Sequences by Hybridization to Arrays of Oligonucleotides: Evaluation using Experimental Models. Genomics 12: 1008-1017.

[0212] 10. de Vries, M. S., Elloway, D. J., Wendl, R. H., and Hunziker, H. E. (1992). Photoionisation mass spectrometer with a microscope laser desorption source. Rev. Sci. Instrum. 63(6): 3321-3325.

[0213] 11. Zubkov, A. M., and Mikhailov, V. G. (1979). Repetitions of s-tuples in a sequence of independent trials. Theory Prob. Appl. 24: 269-282.

[0214] 12. Cozzarelli, N. R., Melechen, N. E., Jovin, T. M. and Kornberg, A. (1967). BBRC 28: 578-586.

1. A reagent comprising a) an analyte moiety comprising at least two analyte residues, and linked to b) a tag moiety comprising one or more reporter groups suitable for detection by mass spectrometry, excluding oligonucleotides, wherein a reporter group designates an analyte residue, and the reporter group at each position of the tag moiety is chosen to designate an analyte residue at a defined position of the analyte moiety.
2. A reagent as claimed in claim 1, wherein there is provided a linker group to which is attached the analyte moiety and the tag moiety.
3. A reagent as claimed in claim 1 or claim 2, wherein the analyte moiety is a chain of n analyte residues, and the tag moiety is a chain of up to n reporter groups, the reporter group at each position of the tag chain being chosen to designate the analyte residue at a corresponding position of the analyte chain.
4. A reagent as claimed in any one of claims 1 to 3, wherein the analyte moiety is linked to the tag moiety by a photocleavable link.
5. A reagent as claimed in any one of claims 1 to 4, having a formula A-L-R where A is a chain of n analyte residues constituting the analyte moiety, L is the linker, R is a chain of up to n reporter groups constituting the tag moiety, and n is 2-20, wherein the tag moiety contains information defining the location of analyte residues in the analyte moiety.
6. A reagent as claimed in any one of claims 2 to 5, wherein the linker comprises an aromatic group carrying a hydroxy, amino or sulphydryl group for analyte moiety synthesis, and a reactive group for tag moiety synthesis.
7. A reagent as claimed in claim 6, wherein the aromatic group carrying a hydroxy, amino or sulphydryl group for analyte moiety synthesis, also carries an o-nitro group for photocleavage.
8. A reagent as claimed in any one of claims 1 to 7, wherein there is present a charged group for analysis by mass spectrometry.
9. A reagent as claimed in any one of claims 1 to 8, wherein the analyte moiety is a peptide chain.
10. A reagent as claimed in any one of claims 1 to 8, wherein the analyte moiety is an oligonucleotide chain.
11. A library of the reagents claimed in any one of claims 1 to 10, wherein the library consists of a plurality of reagents each comprising a different analyte moiety.
12. A library as claimed in claim 11, wherein the library consists of 4^(n) reagents each comprising a different analyte moiety which is a different oligonucleotide chain of n nucleotides.
13. A library as claimed in claim 12, wherein the reagents are mixed together in solution.
14. An assay method which comprises the steps of: providing a target substance; incubating the target substance with the library of reagents claimed in any one of claims 11 to 13 under conditions to cause at least one reagent to bind to the target substance; removing non-bound reagents; recovering the tag moieties of the or each bound reagent; and analysing the recovered tag moieties as an indication of the nature of the analyte moieties bound to the target substance.
15. An assay method as claimed in claim 14, wherein the target substance is an organism or tissue or group of cells.
16. A method of sequencing a target nucleic acid, which method comprises the steps of: a) providing an oligonucleotide immobilised on a support, b) hybridising the target nucleic acid with the immobilised oligonucleotide, c) incubating the hybrid from b) with the library claimed in claim 13, so that an oligonucleotide chain of a first reagent of the library becomes hybridised to the target nucleic acid adjacent the immobilised oligonucleotide, d) ligating the adjacent oligonucleotides, thus forming a ligated first reagent, e) removing other non-ligated reagents, and f) recovering and analysing the tag moiety of the ligated first reagent as an indication of the sequence of a first part of the target nucleic acid.
17. A method as claimed in claim 16, comprising the additional steps of ci) incubating the hybrid from f) with the library claimed in claim 13, so that an oligonucleotide chain of a second reagent of the library becomes hybridised to the target nucleic acid adjacent the oligonucleotide chain of the first reagent, di) ligating the adjacent oligonucleotides, thus forming a ligated second reagent, ei) removing other non-ligated reagents, and fi) recovering and analysing the tag moiety of the ligated second reagent as an indication of the sequence of a second part of the target nucleic acid.
18. A method as claimed in claim 16 or claim 17, wherein: in step a) the oligonucleotide is immobilised on the ends of a series of pins as the support; in step b) an individual clone of target DNA is hybridised to the oligonucleotide immobilised on each individual pin; in steps c) and d) there are formed a series of ligated reagents, with different pins carrying different ligated reagents; and in step f) the tag moiety of each ligated reagent is recovered and analysed as an indication of the sequence of a part of the target DNA.
19. A method as claimed in claim 16 or claim 17, wherein: in step b) each individual clone of target DNA is hybridised to the oligonucleotide immobilised at an individual spaced location of the support; in steps c) and d) there are provided a series of ligated reagents with different spaced locations of the support carrying different ligated reagents; and in step f) the tag moiety of each ligated reagent is recovered and analysed as an indication of the sequence of a part of the target DNA.
20. A method as claimed in claim 16 or claim 17, wherein the method comprises the steps of: a) providing an array of oligonucleotides immobilised at spaced locations on a support, an oligonucleotide at one location being different from oligonucleotides at other locations, b) incubating the target nucleic acid with the array of immobilised oligonucleotides, so as to form hybrids at one or more spaced locations on the support, c) incubating the hybrids from b) with the library claimed in claim 13, so that an oligonucleotide chain of a reagent of the library becomes hybridised to the target nucleic acid adjacent each immobilised oligonucleotide, d) ligating adjacent oligonucleotides, thus forming ligated reagents at the one or more spaced locations on the support, e) removing other non-ligated reagents, and f) recovering and analysing the tag moiety of each ligated reagent as an indication of the sequence of a part of the target nucleic acid.
21. A method as claimed in claim 20, wherein the sequence is known of the oligonucleotide immobilised by a covalent bond at each spaced location on the support.
22. A method of analysing a target DNA, which method comprises the steps of: i) providing the target DNA immobilised on a support, ii) incubating the immobilised target DNA from i) with a plurality of the reagents claimed in claim 10, so that the oligonucleotide chains of different reagents become hybridised to the target DNA on the support, iii) removing non-hybridised reagents, and iv) recovering and analysing the tag moiety of each reagent as an indication of the sequence of a part of the target DNA.
23. A method as claimed in claim 22, comprising the additional steps of: iia) incubating the hybrid from iv) with the library of reagents claimed in claim 13, so that oligonucleotide chains of different reagents become hybridised to the target DNA, iiia) ligating adjacent oligonucleotides hybridised to the target DNA and removing non-ligated reagents, and iva) recovering and analysing the tag moiety of each ligated reagent as an indication of the sequence of part of the target DNA.
24. A method as claimed in claim 22 or claim 23, wherein individual clones of the target nucleic acid are immobilised at spaced locations on the support, whereby in step ii) the oligonucleotide chains of different reagents become hybridised to the target nucleic acid at different spaced locations on the support.
25. A method as claimed in any one of claims 14 to 24, wherein each tag moiety is recovered by photocleavage from its associated reagent.
26. A method as claimed in any one of claims 14 to 25, wherein the tag moiety is analysed by mass spectrometry.
27. Assay equipment comprising: a support having two or more spaced locations thereon; individual clones of a target nucleic acid immobilised at the spaced locations on the support; and different reagents according to claim 10 hybridised to the individual clones of the target nucleic acid at the spaced locations on the support.